Feature significance for multivariate kernel density estimation
نویسندگان
چکیده
Multivariate kernel density estimation provides information about structure in data. Feature significance is a technique for deciding whether features – such as local extrema – are statistically significant. This paper proposes a framework for feature significance in d-dimensional data which combines kernel density derivative estimators and hypothesis tests for modal regions. For the gradient and curvature estimators distributional properties are given, and pointwise test statistics are derived. The hypothesis tests extend the two-dimensional feature significance ideas of Godtliebsen et al. (2002). The theoretical framework is complemented by novel visualisation for three-dimensional data. Applications to real data sets show that tests based on the kernel curvature estimators perform well in identifying modal regions. These results can be enhanced by corresponding tests with kernel gradient estimators.
منابع مشابه
Ridge-Penalty Regularization for Kernel-CCA
CCA and Kernel-CCA are powerful statistical tools that have been successfully employed for feature extraction. However, when working in high-dimensional signal spaces, care has to be taken to avoid overfitting. This paper discusses the influence of ridge penalty regularization on kernel-CCA by relating it to multivariate linear regression(MLR) and partial least squares(PLS). Experimental result...
متن کاملA Bayesian approach to bandwidth selection for multivariate kernel density estimation
Kernel density estimation for multivariate data is an important technique that has a wide range of applications. However, it has received significantly less attention than its univariate counterpart. The lower level of interest in multivariate kernel density estimation is mainly due to the increased difficulty in deriving an optimal data-driven bandwidth as the dimension of the data increases. ...
متن کاملMultivariate Modality Inference Using Gaussian Kernel
The number of modes (also known as modality) of a kernel density estimator (KDE) draws lots of interests and is important in practice. In this paper, we develop an inference framework on the modality of a KDE under multivariate setting using Gaussian kernel. We applied the modal clustering method proposed by [1] for mode hunting. A test statistic and its asymptotic distribution are derived to a...
متن کاملComparison of the Gamma kernel and the orthogonal series methods of density estimation
The standard kernel density estimator suffers from a boundary bias issue for probability density function of distributions on the positive real line. The Gamma kernel estimators and orthogonal series estimators are two alternatives which are free of boundary bias. In this paper, a simulation study is conducted to compare small-sample performance of the Gamma kernel estimators and the orthog...
متن کاملBandwidth Selection for Multivariate Kernel Density Estimation Using MCMC
Kernel density estimation for multivariate data is an important technique that has a wide range of applications in econometrics and finance. However, it has received significantly less attention than its univariate counterpart. The lower level of interest in multivariate kernel density estimation is mainly due to the increased difficulty in deriving an optimal datadriven bandwidth as the dimens...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Computational Statistics & Data Analysis
دوره 52 شماره
صفحات -
تاریخ انتشار 2008